Example: Hyperparameter tuning¶
This example shows how to optimize a model's hyperparameters in a multi-metric run.
We use the breast cancer dataset from sklearn.datasets: a small, easy-to-train dataset where the goal is to predict whether or not a patient has breast cancer.
Load the data¶
In [1]:
# Import packages
from sklearn.datasets import load_breast_cancer
from optuna.distributions import IntDistribution
from atom import ATOMClassifier
In [2]:
# Load the data
X, y = load_breast_cancer(return_X_y=True)
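As a quick sanity check (plain sklearn/numpy, independent of atom), the dataset's shape and class balance can be inspected directly:

```python
# Inspect the raw dataset: 569 samples, 30 features, two classes
# (0 = malignant, 1 = benign).
import numpy as np
from sklearn.datasets import load_breast_cancer

X, y = load_breast_cancer(return_X_y=True)
print(X.shape)  # -> (569, 30)

classes, counts = np.unique(y, return_counts=True)
print(dict(zip(classes.tolist(), counts.tolist())))  # -> {0: 212, 1: 357}
```

Note that atom reports a shape of (569, 31) below because it counts the target column as part of the dataset.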
Run the pipeline¶
In [3]:
# Initialize atom
atom = ATOMClassifier(X, y, n_jobs=4, verbose=2, random_state=1)
<< ================== ATOM ================== >>

Configuration ==================== >>
Algorithm task: Binary classification.
Parallel processing with 4 cores.
Parallelization backend: loky

Dataset stats ==================== >>
Shape: (569, 31)
Train set size: 456
Test set size: 113
-------------------------------------
Memory: 141.24 kB
Scaled: False
Outlier values: 167 (1.2%)
In [4]:
# Train a MultiLayerPerceptron model on two metrics
# using a custom number of hidden layers
atom.run(
models="MLP",
metric=["f1", "ap"],
n_trials=10,
est_params={"activation": "relu"},
ht_params={
"distributions": {
"hidden_layer_1": IntDistribution(2, 4),
"hidden_layer_2": IntDistribution(10, 20),
"hidden_layer_3": IntDistribution(10, 20),
"hidden_layer_4": IntDistribution(2, 4),
}
}
)
Training ========================= >>
Models: MLP
Metric: f1, ap

Running hyperparameter tuning for MultiLayerPerceptron...

| trial | hidden_layer_1 | hidden_layer_2 | hidden_layer_3 | hidden_layer_4 | f1 | best_f1 | ap | best_ap | time_trial | time_ht | state |
| ----- | -------------- | -------------- | -------------- | -------------- | ------- | ------- | ------- | ------- | ---------- | ------- | -------- |
| 0 | 3 | 17 | 10 | 2 | 0.9464 | 0.9464 | 0.9844 | 0.9844 | 9.139s | 9.139s | COMPLETE |
| 1 | 2 | 11 | 12 | 3 | 0.9744 | 0.9744 | 0.9991 | 0.9991 | 11.466s | 20.605s | COMPLETE |
| 2 | 3 | 15 | 14 | 4 | 0.9915 | 0.9915 | 0.9978 | 0.9991 | 8.570s | 29.175s | COMPLETE |
| 3 | 2 | 19 | 10 | 4 | 0.9655 | 0.9915 | 0.9878 | 0.9991 | 9.208s | 38.383s | COMPLETE |
| 4 | 3 | 16 | 11 | 2 | 0.9661 | 0.9915 | 0.9981 | 0.9991 | 0.657s | 39.039s | COMPLETE |
| 5 | 4 | 20 | 13 | 4 | 0.9739 | 0.9915 | 0.9989 | 0.9991 | 0.623s | 39.662s | COMPLETE |
| 6 | 4 | 19 | 10 | 2 | 0.9828 | 0.9915 | 0.9907 | 0.9991 | 0.601s | 40.263s | COMPLETE |
| 7 | 2 | 19 | 11 | 3 | 0.7733 | 0.9915 | 0.9997 | 0.9997 | 0.601s | 40.863s | COMPLETE |
| 8 | 4 | 15 | 17 | 2 | 0.9915 | 0.9915 | 0.9997 | 0.9997 | 0.601s | 41.464s | COMPLETE |
| 9 | 4 | 19 | 10 | 4 | 0.9828 | 0.9915 | 0.9822 | 0.9997 | 0.599s | 42.062s | COMPLETE |

Hyperparameter tuning ---------------------------
Best trial --> 8
Best parameters:
 --> hidden_layer_sizes: (4, 15, 17, 2)
Best evaluation --> f1: 0.9915   ap: 0.9997
Time elapsed: 42.062s
Fit ---------------------------------------------
Train evaluation --> f1: 0.9965   ap: 0.9991
Test evaluation --> f1: 0.9718   ap: 0.9938
Time elapsed: 1.515s
-------------------------------------------------
Time: 43.578s

Final results ==================== >>
Total time: 43.815s
-------------------------------------
MultiLayerPerceptron --> f1: 0.9718   ap: 0.9938
In [5]:
# For multi-metric runs, the selected best trial is the first in the Pareto front
atom.mlp.best_trial
Out[5]:
FrozenTrial(number=8, state=1, values=[0.9914529914529915, 0.9997077732320282], datetime_start=datetime.datetime(2023, 11, 4, 19, 13, 50, 113304), datetime_complete=datetime.datetime(2023, 11, 4, 19, 13, 50, 713850), params={'hidden_layer_1': 4, 'hidden_layer_2': 15, 'hidden_layer_3': 17, 'hidden_layer_4': 2}, user_attrs={'estimator': MLPClassifier(hidden_layer_sizes=(4, 15, 17, 2), random_state=1)}, system_attrs={'nsga2:generation': 0}, intermediate_values={}, distributions={'hidden_layer_1': IntDistribution(high=4, log=False, low=2, step=1), 'hidden_layer_2': IntDistribution(high=20, log=False, low=10, step=1), 'hidden_layer_3': IntDistribution(high=20, log=False, low=10, step=1), 'hidden_layer_4': IntDistribution(high=4, log=False, low=2, step=1)}, trial_id=8, value=None)
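To make "first in the Pareto front" concrete, here is a minimal, self-contained sketch (an illustration, not atom's internal code) of what non-domination means when maximizing two metrics such as f1 and average precision:

```python
# A trial is on the Pareto front if no other trial is at least as good
# on every metric and strictly better on at least one.
def pareto_front(points):
    """Return indices of points not dominated by any other point."""
    front = []
    for i, p in enumerate(points):
        dominated = any(
            all(q[k] >= p[k] for k in range(len(p)))
            and any(q[k] > p[k] for k in range(len(p)))
            for j, q in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append(i)
    return front

# (f1, ap) pairs taken from four of the trials above.
scores = [(0.9464, 0.9844), (0.9915, 0.9978), (0.7733, 0.9997), (0.9915, 0.9997)]
print(pareto_front(scores))  # -> [3]: the last point dominates all others
```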
In [6]:
atom.plot_pareto_front()
In [7]:
# If you are unhappy with the results, it's possible to continue the study
atom.mlp.hyperparameter_tuning(n_trials=5)
Running hyperparameter tuning for MultiLayerPerceptron...

| trial | hidden_layer_1 | hidden_layer_2 | hidden_layer_3 | hidden_layer_4 | f1 | best_f1 | ap | best_ap | time_trial | time_ht | state |
| ----- | -------------- | -------------- | -------------- | -------------- | ------- | ------- | ------- | ------- | ---------- | ------- | -------- |
| 10 | 4 | 18 | 13 | 4 | 0.9831 | 0.9915 | 0.9997 | 0.9997 | 0.673s | 42.735s | COMPLETE |
| 11 | 2 | 14 | 19 | 2 | 0.9421 | 0.9915 | 0.9899 | 0.9997 | 0.604s | 43.339s | COMPLETE |
| 12 | 2 | 11 | 10 | 4 | 0.7733 | 0.9915 | 0.99 | 0.9997 | 0.617s | 43.955s | COMPLETE |
| 13 | 2 | 12 | 15 | 2 | 0.9558 | 0.9915 | 0.9985 | 0.9997 | 0.595s | 44.550s | COMPLETE |
| 14 | 3 | 11 | 16 | 4 | 0.7733 | 0.9915 | 0.9721 | 0.9997 | 0.663s | 45.212s | COMPLETE |

Hyperparameter tuning ---------------------------
Best trial --> 8
Best parameters:
 --> hidden_layer_sizes: (4, 15, 17, 2)
Best evaluation --> f1: 0.9915   ap: 0.9997
Time elapsed: 45.212s
In [8]:
# The trials attribute gives an overview of the trial results
atom.mlp.trials
Out[8]:
| trial | hidden_layer_1 | hidden_layer_2 | hidden_layer_3 | hidden_layer_4 | estimator | f1 | best_f1 | ap | best_ap | time_trial | time_ht | state |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 3 | 17 | 10 | 2 | MLPClassifier(hidden_layer_sizes=(3, 17, 10, 2... | 0.946429 | 0.991453 | 0.984402 | 0.999708 | 9.138911 | 9.138911 | COMPLETE |
| 1 | 2 | 11 | 12 | 3 | MLPClassifier(hidden_layer_sizes=(2, 11, 12, 3... | 0.974359 | 0.991453 | 0.999128 | 0.999708 | 11.466475 | 20.605386 | COMPLETE |
| 2 | 3 | 15 | 14 | 4 | MLPClassifier(hidden_layer_sizes=(3, 15, 14, 4... | 0.991453 | 0.991453 | 0.997842 | 0.999708 | 8.569545 | 29.174931 | COMPLETE |
| 3 | 2 | 19 | 10 | 4 | MLPClassifier(hidden_layer_sizes=(2, 19, 10, 4... | 0.965517 | 0.991453 | 0.987805 | 0.999708 | 9.207920 | 38.382851 | COMPLETE |
| 4 | 3 | 16 | 11 | 2 | MLPClassifier(hidden_layer_sizes=(3, 16, 11, 2... | 0.966102 | 0.991453 | 0.998086 | 0.999708 | 0.656597 | 39.039448 | COMPLETE |
| 5 | 4 | 20 | 13 | 4 | MLPClassifier(hidden_layer_sizes=(4, 20, 13, 4... | 0.973913 | 0.991453 | 0.998855 | 0.999708 | 0.622566 | 39.662014 | COMPLETE |
| 6 | 4 | 19 | 10 | 2 | MLPClassifier(hidden_layer_sizes=(4, 19, 10, 2... | 0.982759 | 0.991453 | 0.990748 | 0.999708 | 0.600547 | 40.262561 | COMPLETE |
| 7 | 2 | 19 | 11 | 3 | MLPClassifier(hidden_layer_sizes=(2, 19, 11, 3... | 0.773333 | 0.991453 | 0.999708 | 0.999708 | 0.600546 | 40.863107 | COMPLETE |
| 8 | 4 | 15 | 17 | 2 | MLPClassifier(hidden_layer_sizes=(4, 15, 17, 2... | 0.991453 | 0.991453 | 0.999708 | 0.999708 | 0.600546 | 41.463653 | COMPLETE |
| 9 | 4 | 19 | 10 | 4 | MLPClassifier(hidden_layer_sizes=(4, 19, 10, 4... | 0.982759 | 0.991453 | 0.982168 | 0.999708 | 0.598815 | 42.062468 | COMPLETE |
| 10 | 4 | 18 | 13 | 4 | MLPClassifier(hidden_layer_sizes=(4, 18, 13, 4... | 0.983051 | 0.991453 | 0.999708 | 0.999708 | 0.672611 | 42.735079 | COMPLETE |
| 11 | 2 | 14 | 19 | 2 | MLPClassifier(hidden_layer_sizes=(2, 14, 19, 2... | 0.942149 | 0.991453 | 0.989914 | 0.999708 | 0.603549 | 43.338628 | COMPLETE |
| 12 | 2 | 11 | 10 | 4 | MLPClassifier(hidden_layer_sizes=(2, 11, 10, 4... | 0.773333 | 0.991453 | 0.990024 | 0.999708 | 0.616561 | 43.955189 | COMPLETE |
| 13 | 2 | 12 | 15 | 2 | MLPClassifier(hidden_layer_sizes=(2, 12, 15, 2... | 0.955752 | 0.991453 | 0.998518 | 0.999708 | 0.594541 | 44.549730 | COMPLETE |
| 14 | 3 | 11 | 16 | 4 | MLPClassifier(hidden_layer_sizes=(3, 11, 16, 4... | 0.773333 | 0.991453 | 0.972070 | 0.999708 | 0.662602 | 45.212332 | COMPLETE |
In [9]:
# Select a custom best trial...
atom.mlp.best_trial = 2
# ...and check that the best parameters are now those in the selected trial
atom.mlp.best_params
Out[9]:
{'hidden_layer_sizes': (3, 15, 14, 4)}
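As the FrozenTrial output above shows, atom wraps a scikit-learn MLPClassifier, so the tuned layer sizes map directly onto its hidden_layer_sizes parameter. A rough standalone equivalent (a sketch only: the train/test split, scaler, and max_iter here are my own choices, not atom's exact pipeline):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

# The best_params from the selected trial plugged into a plain sklearn estimator.
model = make_pipeline(
    StandardScaler(),
    MLPClassifier(
        hidden_layer_sizes=(3, 15, 14, 4),
        activation="relu",
        max_iter=500,
        random_state=1,
    ),
)
model.fit(X_train, y_train)
print(model.score(X_test, y_test))
```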
In [10]:
# Lastly, fit the model on the complete training set
# using the new combination of hyperparameters
atom.mlp.fit()
Fit --------------------------------------------- Train evaluation --> f1: 0.9983 ap: 0.9998 Test evaluation --> f1: 0.9718 ap: 0.9947 Time elapsed: 3.048s
Analyze the results¶
In [11]:
atom.plot_trials()
In [12]:
atom.plot_parallel_coordinate()